The bayesynergy R package implements a Bayesian semi-parametric regression model for estimating the dose-response function of in-vitro drug combination experiments. The Bayesian framework offers full uncertainty quantification of the dose response function and any derived summary statistics, as well as natural handling of replicates and missing data. The Bayesian model is implemented in Stan (Stan Development Team (2020)), taking advantage of the efficient ‘No U-Turn Sampler’ as well as variational Bayes for quick approximations of the true posterior.
The package is further equipped with plotting functions for summarizing a drug response experiment, parallel processing for large drug combination screens, as well as tools for summarizing and comparing results across a screen.
The dose-response function \(f:\boldsymbol{x} \to (0,1)\), maps drug concentrations \(\boldsymbol{x}\) to a measure of cell viability – zero corresponding to all cells being dead after treatment, one corresponding to all cells still alive. In drug-combination screens, it is common to assume that the dose-response function can be broken down as \[ f(\boldsymbol{x}) = p_0(\boldsymbol{x})+\Delta(\boldsymbol{x}), \] where \(p_0(\boldsymbol{x})\) encodes a non-interaction assumption, and \(\Delta(\boldsymbol{x})\) captures the residual interaction effect.
The non-interaction assumption, \(p_0(\boldsymbol{x})\), captures what can be reasonably assumed about a joint drug effect, given estimates of the drugs’ individual effects. We assume a Bliss-style independence assumption, where we first assume that the individual drugs’ dose-response function takes the form of a log-logistic curve \[ h_i(x_i|l,s,m) = l + \frac{1-l}{1+10^{s(x_i-m)}}, \] where \(l\) is the lower asymptote, \(s\) the slope, and \(m\) the drug’s ‘EC-50’ on the \(\log_{10}\) scale. The Bliss assumption then amounts to a probabilistic independence assumption, where \[ p_0(\boldsymbol{x}) = h_1(x_1|l_1,s_1,m_1) \ h_2(x_2|l_2,s_2,m_2). \] We call it probabilistic, because we can interpret the individual dose-response curves \(h_i()\) as probabilities of cell survival. Defining the events \[ \begin{align} A_i & = \text{A cell survives drug A at concentration $x_{1i}$} \\ B_j & = \text{A cell survives drug B at concentration $x_{2j}$} \\ C_{ij} & = \text{A cell survives both drugs at concentration $\boldsymbol{x}=(x_{1i},x_{2j})$}, \end{align} \] the corresponding probabilities become \[ p_0(\boldsymbol{x}) = P(C_{ij}) = P(A_i)P(B_j) = h_1(x_{1i}|l_1,s_1,m_1) \ h_2(x_{2j}|l_2,s_2,m_2). \]
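As an illustration, the log-logistic curve and the Bliss product can be sketched in R as follows (function names and parameter values here are ours, for illustration only, not the package's internal code):

```r
# Log-logistic monotherapy curve; x is a drug concentration on the log10 scale.
loglogistic <- function(x, l, s, m) {
  # l: lower asymptote, s: slope, m: log10 EC-50
  l + (1 - l) / (1 + 10^(s * (x - m)))
}

# Bliss non-interaction surface: product of the two monotherapy survival curves
bliss_p0 <- function(x1, x2, par1, par2) {
  loglogistic(x1, par1["l"], par1["s"], par1["m"]) *
    loglogistic(x2, par2["l"], par2["s"], par2["m"])
}

p0 <- bliss_p0(0, 0, c(l = 0.1, s = 1, m = 0), c(l = 0.2, s = 1.5, m = 0.5))
```

Note that viability approaches 1 at very low concentrations and the lower asymptote \(l\) at very high concentrations, so the Bliss product always stays in \((0,1)\).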
The interaction component, \(\Delta(\boldsymbol{x})\), captures any joint effect of the drugs that is not captured by the non-interaction assumption. If two drugs are more effective together than would be expected under \(p_0\), we call it synergy, which corresponds to \(\Delta < 0\). The opposite effect, \(\Delta > 0\), is deemed antagonism.
Because the interaction landscape can be complex, with multiple local peaks and valleys, we model this term non-parametrically using a Gaussian process (GP) prior. To ensure that the resulting dose-response function only takes values in the interval \((0,1)\), we push the GP through a transformation function \(g()\). That is, \[ z(\boldsymbol{x}) \sim \mathcal{GP}(0,\kappa(\boldsymbol{x},\boldsymbol{x}')) \\ \Delta(\boldsymbol{x}) = g(z(\boldsymbol{x})), \] where the transformation function is given by \[ g(z(\boldsymbol{x})) = \frac{-p_0(\boldsymbol{x})}{1+\exp\left\{b_1z(\boldsymbol{x})+\log\left[\frac{p_0(\boldsymbol{x})}{1-p_0(\boldsymbol{x})}\right]\right\}} + \frac{1-p_0(\boldsymbol{x})}{1+\exp\left\{-b_2z(\boldsymbol{x})-\log\left[\frac{p_0(\boldsymbol{x})}{1-p_0(\boldsymbol{x})}\right]\right\}}. \] In addition to ensuring the proper bounds for the dose-response function, this transformation has the property that \(g(0)=0\), which corresponds to an a priori assumption that \[ \mathbb{E}\left[f(\boldsymbol{x}) | p_0(\boldsymbol{x})\right] \approx p_0(\boldsymbol{x}). \] That is, we turn our non-interaction assumption into a formal prior expectation on the dose-response function. This achieves two things: (1) a slightly conservative model that needs to be convinced that interaction effects are present, and (2) no built-in bias towards interaction in the prior structure.
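A direct transcription of \(g\) in R (a sketch; \(b_1 = b_2 = 1\) here, matching the prior means below):

```r
# Transformation pushing a GP draw z into the interval (-p0, 1 - p0),
# so that f = p0 + g(z) stays in (0, 1). Note that g(0) = 0.
g_transform <- function(z, p0, b1 = 1, b2 = 1) {
  lo <- log(p0 / (1 - p0))  # logit of the non-interaction surface
  -p0 / (1 + exp(b1 * z + lo)) + (1 - p0) / (1 + exp(-b2 * z - lo))
}
```

For example, with \(p_0 = 0.3\), the output of `g_transform` ranges over \((-0.3, 0.7)\) as \(z\) ranges over the real line, and is exactly 0 at \(z = 0\).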
The covariance function \(\kappa(\boldsymbol{x},\boldsymbol{x}')\) can be given multiple specifications, including a squared exponential, Matérn, and Rational Quadratic covariance functions. By default, we use a Matérn covariance with the \(\nu\) parameter set to 3/2 yielding \[ \kappa(\boldsymbol{x},\boldsymbol{x}') = \sigma_f^2\left(1+\frac{\sqrt{3}\Vert\boldsymbol{x}-\boldsymbol{x}'\Vert}{\ell}\right)\exp\left\{-\frac{\sqrt{3}\Vert\boldsymbol{x}-\boldsymbol{x}'\Vert}{\ell}\right\}. \] Finally, by utilizing the natural grid structure of the drug concentrations, we can write the kernel function as \[ \kappa(\boldsymbol{x},\boldsymbol{x}') = \sigma_f^2 \kappa(x_1,x_1')\kappa(x_2,x_2'), \] which induces a Kronecker product structure on the final covariance matrix. Following the implementation detailed in Flaxman et al. (2015), this greatly improves the computational efficiency of the model.
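The kernel and its Kronecker factorisation over the concentration grid can be sketched as follows (our own helper names; \(\ell\) and \(\sigma_f^2\) are set arbitrarily, and the factor kernels carry unit variance):

```r
# Matern-3/2 correlation between two sets of 1-d inputs
matern32 <- function(x, xp, ell) {
  d <- abs(outer(x, xp, "-"))
  (1 + sqrt(3) * d / ell) * exp(-sqrt(3) * d / ell)
}

x1 <- log10(c(0.1, 1, 10))   # drug 1 concentrations, log10 scale
x2 <- log10(c(0.5, 5))       # drug 2 concentrations, log10 scale
K1 <- matern32(x1, x1, ell = 1)
K2 <- matern32(x2, x2, ell = 1)
sigma_f2 <- 1
# Covariance over the full 3 x 2 concentration grid via the Kronecker product
K <- sigma_f2 * kronecker(K1, K2)
```

The Kronecker structure means only the small per-drug matrices need to be factorised, which is what drives the computational gains of Flaxman et al. (2015).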
Given the above formulation for the dose-response function \(f\), we assume that we have access to noisy observations from it. These observations are typically generated from various cellular assays, e.g. viability assays. In particular, we assume that given concentration points \(\boldsymbol{x}_1,\ldots,\boldsymbol{x}_n\) we have observations \(y_1,\ldots,y_n\), where \[ y_i = f(\boldsymbol{x}_i) + \epsilon_i, \] and the errors \(\epsilon_i\) are assumed normally distributed with mean zero. For the variance of the observational errors, by default we model these in a heteroscedastic fashion as \[ \text{Var}\left[\epsilon_i\right] = \sigma^2(f(\boldsymbol{x}_i)+\lambda), \] where \(\lambda\) is set to a small value to handle the case where \(f = 0\) but some residual noise remains. In a typical setup where cell viability is calculated through a normalization to positive and negative controls, \(\lambda\) can be set empirically as \[ \lambda = \frac{\sigma^2_{+}}{\sigma^2_{-}}, \] where \(\sigma^2_{+}\) and \(\sigma^2_{-}\) denote the variances of the positive and negative controls, respectively.
We choose a heteroscedastic model by default, because in cell viability assays, the observations are normalized in relation to positive and negative controls. The positive controls typically have much lower variance compared to the negative controls, which translates to viability measures closer to zero being more precisely measured. We also allow homoscedastic noise as an option.
The positive and negative controls essentially control the signal-to-noise ratio in cell viability assays. If the user has access to these, they can be included in the model to help calibrate the posterior distribution – particularly in the case with zero replicates.
Let \(\xi^-_k\) and \(\xi^+_l\) denote the negative and positive controls for \(k=1,\ldots,n_-\) and \(l=1,\ldots,n_+\). These measurements are raw readings from the plate and are used to calculate cell viability. For an additional well, treated with drug concentration \(\mathbf{x}_i\), we denote the raw output by \(\xi_i\), and calculate cell viability for this well by the formula: \[ y_i = \frac{\xi_i-\tilde{\xi^+}}{\tilde{\xi^-}-\tilde{\xi^+}}, \] where \(\tilde{\xi^-}\) and \(\tilde{\xi^+}\) denote some measure of centrality of the negative and positive controls, typically the mean or median.
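As a sketch of this normalisation and the empirical \(\lambda\), with made-up raw plate reads (the helper name and the numbers are ours):

```r
# Normalise raw plate reads against control centres (median by default)
normalise_viability <- function(xi, xi_neg, xi_pos, center = median) {
  (xi - center(xi_pos)) / (center(xi_neg) - center(xi_pos))
}

xi_neg <- c(980, 1015, 1002, 990)  # negative controls: unhindered growth
xi_pos <- c(105, 98, 110, 102)     # positive controls: complete cell death

# The controls themselves can be normalised, centring them near 1 and 0
y_neg <- normalise_viability(xi_neg, xi_neg, xi_pos)
y_pos <- normalise_viability(xi_pos, xi_neg, xi_pos)
lambda <- var(y_pos) / var(y_neg)  # empirical lambda as defined above
```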
The controls can themselves be passed through this formula and converted to percent viability. From the variances of these normalized controls, \(\lambda\) can be set as indicated above, and the negative controls can be added directly into the algorithm. Negative controls represent unhindered cell growth, and can be thought of as samples from the dose-response function \(f(\mathbf{x})\) at concentration \(\mathbf{x}=(0,0)\). These can then be added directly to the \(\texttt{bayesynergy}\) function in the same way as regular observations.
The full model specification, with all default prior distributions, looks like \[ y_i \sim \mathcal{N}\left(f(\boldsymbol{x}_i),\sigma^2(f(\boldsymbol{x}_i)+\lambda)\right), \ i = 1,\ldots, n \\ \sigma \sim \text{Inv-Ga}\left(5,1\right), \ \lambda = 0.005 \\ f(\boldsymbol{x}_i) = p_0(\boldsymbol{x}_i)+\Delta(\boldsymbol{x}_i) \mathbb{I}(10^{\boldsymbol{x}_i}>0) \\ p_0(\boldsymbol{x}) = h_1(x_1|l_1,s_1,m_1) \ h_2(x_2|l_2,s_2,m_2) \\ l_j \sim \text{Beta}(1,1.25), \ s_j \sim \text{Gamma}(1,1), \\ m_j \sim \mathcal{N}(\theta_j,\sigma_{m_j}^2), \ j = 1,2 \\ \theta_j \sim \mathcal{N}(0,1), \ \sigma_{m_j}^2 \sim \text{Inv-Ga}\left(3,2\right), \ j = 1,2 \\ \Delta(\boldsymbol{x}) = g(z(\boldsymbol{x})), \ z(\boldsymbol{x}) \sim \mathcal{GP}(0,\kappa(\boldsymbol{x},\boldsymbol{x}')) \\ g(z(\boldsymbol{x})) = \frac{-p_0(\boldsymbol{x})}{1+\exp\left\{b_1z(\boldsymbol{x})+\log\left[\frac{p_0(\boldsymbol{x})}{1-p_0(\boldsymbol{x})}\right]\right\}} + \frac{1-p_0(\boldsymbol{x})}{1+\exp\left\{-b_2z(\boldsymbol{x})-\log\left[\frac{p_0(\boldsymbol{x})}{1-p_0(\boldsymbol{x})}\right]\right\}} \\ \kappa(\boldsymbol{x},\boldsymbol{x}') = \sigma_f^2\left(1+\frac{\sqrt{3}\Vert\boldsymbol{x}-\boldsymbol{x}'\Vert}{\ell}\right)\exp\left\{-\frac{\sqrt{3}\Vert\boldsymbol{x}-\boldsymbol{x}'\Vert}{\ell}\right\}, \\ \sigma_f^2 \sim \text{log-}\mathcal{N}(1,1), \ \ell \sim \text{Inv-Ga}(5,5) \\ b_j \sim \mathcal{N}(1,0.1^2), \ j = 1,2. \] Note that some of these specifications can be altered. For example, by default we estimate the lower asymptotes, but they can also be fixed at zero.
In the model specification above, the interaction term is multiplied with an indicator function \(\mathbb{I}(\boldsymbol{x}>0)\) taking the value 1 if and only if all elements of \(\boldsymbol{x}\) are strictly larger than zero. This ensures that we do not allow for interaction when one of the drugs is at zero concentration.
From the posterior dose-response function \(f | \mathbf{y}\), we derive a number of summary statistics concerning efficacy, synergy and antagonism.
For the monotherapy curves, we produce estimates of the drug sensitivity score (DSS) of each drug by the integral
\[ DSS_0 = \int_a^b 1-h_j(x) \ \text{d}x, \] where \(a=h_j^{-1}(1-\epsilon/2)\) and \(b=\max(x_j)\). That is, the integral is taken from where \(h_j\) crosses an ‘activation threshold’ \(1-\epsilon/2\), to the maximum concentration of the drug measured in the screen. The activation threshold \(\epsilon\) is set to 5%. This value is further standardized by the total volume available for drug efficacy, \[ DSS = \frac{DSS_0}{(1-\epsilon/2)(b-a)}. \] From here, values can be further standardized as in Yadav et al. (2014).
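Numerically, the DSS integral can be sketched as follows (parameter values, the maximum concentration \(b\), and the helper name `h` are assumptions for illustration; \(\epsilon = 0.05\) as in the text):

```r
# Monotherapy curve with assumed parameters (l = 0, s = 1, m = 0)
h <- function(x, l = 0, s = 1, m = 0) l + (1 - l) / (1 + 10^(s * (x - m)))

eps <- 0.05
b <- 3  # assumed maximum log10 concentration in the screen
# a: where the curve crosses the activation threshold 1 - eps/2
a <- uniroot(function(x) h(x) - (1 - eps / 2), c(-10, b))$root

dss0 <- integrate(function(x) 1 - h(x), a, b)$value
dss <- dss0 / ((1 - eps / 2) * (b - a))  # standardised score in [0, 1]
```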
To summarise the total drug-response function, we utilise the measures developed in Cremaschi et al. (2019). As with the DSS scores, we calculate the area above the response function, and denote this the ‘residual volume under the surface’ or rVUS – as it is the volume left over from integrating the surface.
In general, the integral looks like
\[ rVUS_0(f) = \int_a^b \int_c^d 1-f(\mathbf{x}) \ \text{d}\mathbf{x} \] where the integrals are taken over the observed drug range, i.e. \(a = \min (x_1)\), \(b = \max (x_1)\), \(c = \min (x_2)\), \(d = \max (x_2)\).
From here, we further standardise the rVUS scores to the total volume available for efficacy
\[ rVUS(f) = \frac{rVUS_0(f)}{(b-a)(d-c)}. \] The model calculates \(rVUS\) for the dose-response function \(f\), giving a measure of combined efficacy. In addition, we calculate \(rVUS(p_0)\), the non-interaction efficacy, and \(rVUS(\Delta)\), the interaction efficacy. For the interaction term \(\Delta\), we also compute \(rVUS(\Delta^{-})\) and \(rVUS(\Delta^{+})\) for synergy and antagonism, where \(\Delta^{+}\) and \(\Delta^{-}\) denote the positive and negative parts of \(\Delta\), respectively. That is,
\[ \Delta^{+}(\mathbf{x}) = \max(0,\Delta(\mathbf{x})) \\ \Delta^{-}(\mathbf{x}) = \max(0,-\Delta(\mathbf{x})). \]
When running screens with a large amount of drug combinations, it is helpful to have a normalised measure for comparing synergy across experiments. The \(rVUS\) scores defined above are already standardized to their drug concentration range, but to compare across experiments, we also standardize with respect to the uncertainty in the model. To do this, we calculate a synergy score by normalizing \(rVUS(\Delta^{-})\) with respect to its standard deviation. \[ \text{Synergy score} = \frac{\text{mean}(rVUS(\Delta^{-}))}{\text{sd}(rVUS(\Delta^{-}))}. \]
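Conceptually, the synergy score is computed from posterior draws of \(\Delta\) over the concentration grid. Here is a sketch with simulated draws standing in for the posterior, where a grid average approximates the standardised integral (all names and numbers are ours):

```r
# Simulated posterior draws of Delta on a 5 x 5 concentration grid
set.seed(1)
n_draws <- 200
Delta <- matrix(rnorm(n_draws * 25, mean = -0.05, sd = 0.03), nrow = n_draws)

Delta_minus <- pmax(0, -Delta)            # synergy (negative) part, per draw
rvus_syn <- rowMeans(Delta_minus) * 100   # % of available volume, per draw
synergy_score <- mean(rvus_syn) / sd(rvus_syn)
```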
In the R package, we’ve attached two example datasets from a large drug combination screening experiment on diffuse large B-cell lymphoma. We’ll use these to show some simple use cases of the main functions and how to interpret the results.
Let’s load in the first example and have a look at it
library(bayesynergy)
data("mathews_DLBCL")
y = mathews_DLBCL[[1]][[1]]
x = mathews_DLBCL[[1]][[2]]
head(cbind(y,x))
##      Viability ibrutinib ispinesib
## [1,] 1.2295618 0.0000 0
## [2,] 1.0376006 0.1954 0
## [3,] 1.1813851 0.7812 0
## [4,] 0.5882688 3.1250 0
## [5,] 0.4666700 12.5000 0
## [6,] 0.2869514 50.0000 0
We see that the measured viability scores are stored in the vector y, while x is a matrix with two columns giving the corresponding concentrations where the viability scores were read off.
Fitting the regression model is straightforward, and can be done with default settings simply by running the following code (where we add the names of the drugs involved, as well as the concentration units, for plotting purposes).
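A hedged sketch of such a call (the argument names `drug_names` and `units` reflect our reading of the interface, and the concentration units here are an assumption):

```r
# Fit the model with default settings; drug names and units are for plotting
fit <- bayesynergy(y, x,
                   drug_names = c("ibrutinib", "ispinesib"),
                   units = c("µM", "µM"))
```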
##
## SAMPLING FOR MODEL 'gp_grid' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 0.0002 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 2 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 1: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 1: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 1: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 1: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 2.48102 seconds (Warm-up)
## Chain 1: 2.59128 seconds (Sampling)
## Chain 1: 5.0723 seconds (Total)
## Chain 1:
## (similar sampling output for chains 2-4 omitted)
The resulting model can be summarised by running
## mean se_mean sd 2.5% 50% 97.5% n_eff Rhat
## la_1[1] 0.330 0.003036 0.0821 1.25e-01 3.42e-01 0.460 731 1.004
## la_2[1] 0.386 0.002718 0.0617 1.74e-01 3.96e-01 0.460 516 1.010
## log10_ec50_1 0.484 0.005910 0.1704 2.33e-01 4.49e-01 0.931 831 1.002
## log10_ec50_2 -1.024 0.019946 1.0066 -3.38e+00 -8.92e-01 0.488 2547 1.000
## slope_1 1.986 0.019985 0.9413 8.01e-01 1.76e+00 4.303 2218 1.003
## slope_2 1.427 0.018249 1.0459 1.14e-01 1.18e+00 4.153 3285 1.002
## ell 3.045 0.032709 1.5341 1.21e+00 2.70e+00 6.870 2200 1.000
## sigma_f 0.857 0.016475 0.8574 1.72e-01 6.17e-01 2.938 2708 1.000
## s 0.105 0.000276 0.0153 8.00e-02 1.03e-01 0.140 3057 1.002
## dss_1 37.873 0.139525 5.7369 2.67e+01 3.80e+01 49.024 1691 1.002
## dss_2 43.914 0.272526 7.8026 2.33e+01 4.53e+01 55.594 820 1.008
## rVUS_f 82.802 0.015023 0.9360 8.09e+01 8.28e+01 84.660 3882 1.000
## rVUS_p0 73.096 0.033670 2.3682 6.82e+01 7.32e+01 77.354 4947 1.000
## rVUS_Delta -9.706 0.037315 2.5462 -1.50e+01 -9.59e+00 -4.980 4656 1.000
## rVUS_syn 9.760 0.036700 2.4970 5.24e+00 9.63e+00 15.016 4629 1.000
## rVUS_ant 0.054 0.002158 0.1267 5.06e-06 8.46e-05 0.439 3448 0.999
##
## log-Pseudo Marginal Likelihood (LPML) = 52.58371
which gives posterior summaries of the parameters of the model. In addition, the model calculates summary statistics of the monotherapy curves and the dose-response surface, including drug sensitivity scores (DSS) for the two drugs in question, as well as the residual volumes under the surface (rVUS) that capture the notion of efficacy (rVUS_f), non-interaction efficacy (rVUS_p0), interaction (rVUS_Delta), synergy (rVUS_syn) and antagonism (rVUS_ant).
As indicated, the total combined drug efficacy is around 80% (rVUS_f), of which around 70 percentage points can be attributed to \(p_0\) (rVUS_p0), leaving room for 10 percentage points worth of synergy (rVUS_syn). We can also note that the model is fairly certain of this effect, with a 95% credible interval given as (5.237, 15.016).
We can also create plots by simply running
which produces monotherapy curves, monotherapy summary statistics, 2D contour plots of the dose-response function \(f\), the non-interaction assumption \(p_0\) and the interaction \(\Delta\). The last plot displays the \(rVUS\) scores as discussed previously, with corresponding uncertainty.
The package can also generate 3D interactive plots by setting plot3D = T. These are displayed using the plotly library (Plotly Technologies Inc. 2015).
The synergyscreen function provides a workflow for data from large drug combination screens, where multiple drugs are tested in combination on multiple cell lines. It takes as input a list of experiments, each entry being a list containing the elements needed for a call to the main regression function bayesynergy.
Included in the package is the result of a synergyscreen run of 583 drug combinations on the A-375 human melanoma cell line from ONeil et al. (2016). The synergyscreen object is a list with two entries: a data frame with parameter estimates from each experiment, and a list entitled failed – containing experiments that either failed to process entirely, or had an unsatisfactory fit.
## [1] 2
We see that the dataset has two experiments that failed to process during an initial run of synergyscreen. There are many reasons why an experiment might fail to process: an input error, initialization problems, or problems with the parallel processing.
The entries of failed are themselves lists, each containing the information needed to process the experiment through the bayesynergy function:
## [1] "y" "x" "drug_names" "experiment_ID"
## viability L778123 MK-4827
## [1,] 0.759 0.325 0.223
## [2,] 0.755 0.325 0.775
## [3,] 0.548 0.325 2.750
## [4,] 0.307 0.325 10.000
## [5,] 0.787 0.800 0.223
## [6,] 0.820 0.800 0.775
We can rerun experiments that failed to process by simply passing the returned synergyscreen object back into the function. Note that we turn off the default options of saving each fit and plotting everything, and set method = "vb", indicating that we use variational inference to fit the model.
fit_screen = synergyscreen(ONeil_A375, save_raw = F, save_plots = F, parallel = F,
                           bayesynergy_params = list(method = "vb"))
## Chain 1: ------------------------------------------------------------
## Chain 1: EXPERIMENTAL ALGORITHM:
## Chain 1: This procedure has not been thoroughly tested and may be unstable
## Chain 1: or buggy. The interface is subject to change.
## Chain 1: ------------------------------------------------------------
## Chain 1:
## Chain 1:
## Chain 1:
## Chain 1: Gradient evaluation took 0.000231 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 2.31 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Begin eta adaptation.
## Chain 1: Iteration: 1 / 250 [ 0%] (Adaptation)
## Chain 1: Iteration: 50 / 250 [ 20%] (Adaptation)
## Chain 1: Iteration: 100 / 250 [ 40%] (Adaptation)
## Chain 1: Iteration: 150 / 250 [ 60%] (Adaptation)
## Chain 1: Iteration: 200 / 250 [ 80%] (Adaptation)
## Chain 1: Success! Found best value [eta = 1] earlier than expected.
## Chain 1:
## Chain 1: Begin stochastic gradient ascent.
## Chain 1: iter ELBO delta_ELBO_mean delta_ELBO_med notes
## Chain 1: 100 -55.891 1.000 1.000
## Chain 1: 200 144.374 1.194 1.387
## Chain 1: 300 156.579 0.822 1.000
## Chain 1: 400 149.076 0.629 1.000
## Chain 1: 500 161.738 0.519 0.078
## Chain 1: 600 163.019 0.434 0.078
## Chain 1: 700 158.352 0.376 0.078
## Chain 1: 800 159.448 0.330 0.078
## Chain 1: 900 162.109 0.295 0.050
## Chain 1: 1000 160.243 0.267 0.050
## Chain 1: 1100 161.742 0.168 0.029
## Chain 1: 1200 160.932 0.029 0.016
## Chain 1: 1300 157.760 0.024 0.016
## Chain 1: 1400 162.193 0.021 0.016
## Chain 1: 1500 161.941 0.014 0.012
## Chain 1: 1600 162.219 0.013 0.012
## Chain 1: 1700 163.439 0.011 0.009 MEDIAN ELBO CONVERGED
## Chain 1:
## Chain 1: Drawing a sample of size 1000 from the approximate posterior...
## Chain 1: COMPLETED.
## Chain 1: ------------------------------------------------------------
## Chain 1: EXPERIMENTAL ALGORITHM:
## Chain 1: This procedure has not been thoroughly tested and may be unstable
## Chain 1: or buggy. The interface is subject to change.
## Chain 1: ------------------------------------------------------------
## Chain 1:
## Chain 1:
## Chain 1:
## Chain 1: Gradient evaluation took 0.000167 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 1.67 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Begin eta adaptation.
## Chain 1: Iteration: 1 / 250 [ 0%] (Adaptation)
## Chain 1: Iteration: 50 / 250 [ 20%] (Adaptation)
## Chain 1: Iteration: 100 / 250 [ 40%] (Adaptation)
## Chain 1: Iteration: 150 / 250 [ 60%] (Adaptation)
## Chain 1: Iteration: 200 / 250 [ 80%] (Adaptation)
## Chain 1: Success! Found best value [eta = 1] earlier than expected.
## Chain 1:
## Chain 1: Begin stochastic gradient ascent.
## Chain 1: iter ELBO delta_ELBO_mean delta_ELBO_med notes
## Chain 1: 100 -296.035 1.000 1.000
## Chain 1: 200 18.381 9.053 17.105
## Chain 1: 300 119.532 6.317 1.000
## Chain 1: 400 144.867 4.782 1.000
## Chain 1: 500 145.816 3.827 0.846
## Chain 1: 600 146.854 3.190 0.846
## Chain 1: 700 148.744 2.736 0.175
## Chain 1: 800 150.175 2.395 0.175
## Chain 1: 900 150.858 2.130 0.013
## Chain 1: 1000 148.377 1.918 0.017
## Chain 1: 1100 148.899 1.819 0.013 MAY BE DIVERGING... INSPECT ELBO
## Chain 1: 1200 151.582 0.110 0.013
## Chain 1: 1300 153.429 0.027 0.012
## Chain 1: 1400 146.902 0.013 0.012
## Chain 1: 1500 152.777 0.017 0.013
## Chain 1: 1600 150.496 0.017 0.015
## Chain 1: 1700 150.406 0.016 0.015
## Chain 1: 1800 149.891 0.016 0.015
## Chain 1: 1900 152.208 0.017 0.015
## Chain 1: 2000 150.270 0.016 0.015
## Chain 1: 2100 150.225 0.016 0.015
## Chain 1: 2200 150.175 0.014 0.013
## Chain 1: 2300 145.867 0.016 0.015
## Chain 1: 2400 149.790 0.014 0.015
## Chain 1: 2500 152.800 0.012 0.015
## Chain 1: 2600 150.619 0.012 0.014
## Chain 1: 2700 150.871 0.012 0.014
## Chain 1: 2800 141.569 0.019 0.015
## Chain 1: 2900 149.191 0.022 0.020
## Chain 1: 3000 150.091 0.022 0.020
## Chain 1: 3100 146.830 0.024 0.022
## Chain 1: 3200 150.337 0.026 0.023
## Chain 1: 3300 152.835 0.025 0.022
## Chain 1: 3400 149.625 0.024 0.021
## Chain 1: 3500 152.647 0.024 0.021
## Chain 1: 3600 150.797 0.024 0.021
## Chain 1: 3700 150.945 0.024 0.021
## Chain 1: 3800 150.273 0.018 0.020
## Chain 1: 3900 144.451 0.017 0.020
## Chain 1: 4000 150.139 0.020 0.021
## Chain 1: 4100 152.244 0.019 0.020
## Chain 1: 4200 150.227 0.018 0.016
## Chain 1: 4300 151.294 0.017 0.014
## Chain 1: 4400 151.840 0.015 0.013
## Chain 1: 4500 152.833 0.014 0.012
## Chain 1: 4600 154.142 0.014 0.008 MEDIAN ELBO CONVERGED
## Chain 1:
## Chain 1: Drawing a sample of size 1000 from the approximate posterior...
## Chain 1: COMPLETED.
We can also plot the result of the screen:
Sometimes, the bayesynergy function may return with a warning. Ideally, we don’t want any warnings at all, and they should be examined closely, as the posterior samples could be unreliable. Usually, the warning will tell the user how to fix the problem at hand, e.g. by running the chains for longer (setting iter higher), or setting adapt_delta higher. See https://mc-stan.org/misc/warnings.html for some general tips.
Most commonly, the sampler might complain about divergent transitions. The warning will typically look like this:
## Warning: There were 2316 divergent transitions after warmup. See
## http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## to find out why this is a problem and how to eliminate them.
This is indicative of a posterior geometry that is tricky to explore. In the case where there are only a few divergent transitions, the usual remedy is to set adapt_delta to a higher value, i.e. larger than its default of 0.9. This can be done through the control option in the bayesynergy call:
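A sketch of such a call (the exact placement of the control argument is our assumption; the list is passed on to Stan's sampler):

```r
# Raise adapt_delta above its 0.9 default to reduce divergent transitions
fit <- bayesynergy(y, x, control = list(adapt_delta = 0.99))
```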
However, the case above, with 2316 divergent transitions, is indicative of a misspecified model. In our experience, this can happen for a few reasons.
One remedy is setting lower_asymptotes = FALSE in the call. Unless one is specifically interested in these parameters, there is no reason to estimate them – the model fit will typically still be good without them.

Cremaschi, Andrea, Arnoldo Frigessi, Kjetil Taskén, and Manuela Zucknick. 2019. “A Bayesian Approach to Study Synergistic Interaction Effects in in-Vitro Drug Combination Experiments.” http://arxiv.org/abs/1904.04901.
Flaxman, Seth, Andrew Gelman, Daniel Neill, Alex Smola, Aki Vehtari, and Andrew Gordon Wilson. 2015. “Fast Hierarchical Gaussian Processes.” Manuscript in Preparation.
ONeil, Jennifer, Yair Benita, Igor Feldman, Melissa Chenard, Brian Roberts, Yaping Liu, Jing Li, et al. 2016. “An Unbiased Oncology Compound Screen to Identify Novel Combination Strategies.” Molecular Cancer Therapeutics 15 (6): 1155–62. https://doi.org/10.1158/1535-7163.mct-15-0843.
Plotly Technologies Inc. 2015. “Collaborative Data Science.” Montreal, QC: Plotly Technologies Inc. https://plot.ly.
Stan Development Team. 2020. “RStan: The R Interface to Stan.” http://mc-stan.org/.
Yadav, Bhagwan, Tea Pemovska, Agnieszka Szwajda, Evgeny Kulesskiy, Mika Kontro, Riikka Karjalainen, Muntasir Mamun Majumder, et al. 2014. “Quantitative Scoring of Differential Drug Sensitivity for Individually Optimized Anticancer Therapies.” Scientific Reports 4 (1). https://doi.org/10.1038/srep05193.